Semantic annotation for concept-based cross-language medical information retrieval

Authors

  • Martin Volk
  • Bärbel Ripplinger
  • Spela Vintar
  • Paul Buitelaar
  • Diana Raileanu
  • Bogdan Sacaleanu
Abstract

We present a framework for concept-based cross-language information retrieval in the medical domain, which is under development in the MUCHMORE project. Our approach is based on using the Unified Medical Language System (UMLS) as the primary source of semantic data. Documents and queries are annotated with multiple layers of linguistic information. Linguistic processing includes part-of-speech tagging, morphological analysis, phrase recognition and the identification of medical terms and semantic relations between them. The paper describes experiments in monolingual and cross-language document retrieval, performed on a corpus of medical abstracts. Results show that linguistic processing, especially lemmatization and compound analysis for German, is a crucial step in achieving a good baseline performance. On the other hand, they show that semantic information, specifically the combined use of concepts and relations, increases the performance in monolingual and cross-language retrieval.
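As a rough illustration of the idea in the abstract, the sketch below maps English and German terms to shared, language-independent concept identifiers and scores a document by concept overlap with the query. It is a minimal toy under stated assumptions, not the MUCHMORE implementation: the lexicon, the placeholder concept IDs (which are not real UMLS identifiers), the greedy compound splitter, and the names `TERM_TO_CONCEPT`, `split_compound`, `annotate`, and `score` are all hypothetical.

```python
from collections import Counter

# Hypothetical toy lexicon: lowercase term -> language-independent concept ID.
# The IDs are invented placeholders, NOT real UMLS identifiers.
TERM_TO_CONCEPT = {
    "heart": "CONCEPT_HEART",
    "herz": "CONCEPT_HEART",
    "infarction": "CONCEPT_INFARCTION",
    "infarkt": "CONCEPT_INFARCTION",
    "liver": "CONCEPT_LIVER",
    "leber": "CONCEPT_LIVER",
}


def split_compound(token):
    """Greedily split a token into known lexicon entries (longest prefix first).

    Returns [token] unchanged if the token cannot be fully decomposed.
    """
    parts, rest = [], token
    while rest:
        for end in range(len(rest), 0, -1):
            if rest[:end] in TERM_TO_CONCEPT:
                parts.append(rest[:end])
                rest = rest[end:]
                break
        else:
            # No known prefix: give up and keep the original token.
            return [token]
    return parts


def annotate(text):
    """Return a bag of concept IDs found in the text."""
    concepts = Counter()
    for token in text.lower().split():
        for part in split_compound(token):
            concept = TERM_TO_CONCEPT.get(part)
            if concept:
                concepts[concept] += 1
    return concepts


def score(query, document):
    """Count the concepts shared by query and document (multiset overlap)."""
    q, d = annotate(query), annotate(document)
    return sum(min(q[c], d[c]) for c in q)


# An English query matches a German compound through shared concepts.
print(score("heart infarction", "Herzinfarkt"))
```

Because matching happens on concept identifiers rather than surface words, the query and the document need not share a language. The compound splitting step hints at why the abstract singles out compound analysis for German: without it, "Herzinfarkt" would map to no concept at all in this toy lexicon.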

Similar resources

A Systematic Evaluation of Concept-based Cross-Lingual Information Retrieval in the Medical Domain

The paper describes experiments and results of the MuchMore project, which is concerned with a systematic comparison of concept-based and corpus-based methods in cross-language information retrieval (CLIR) in the medical domain. Primary goals of the project are to develop and evaluate methods for the effective use of multilingual thesauri in the semantic annotation of English and German medica...

Semantic relations in concept-based cross-language medical information retrieval

We explore and evaluate the usefulness of semantic annotation, particularly semantic relations, in cross-language information retrieval in the medical domain. As the baseline for automatic semantic annotation we use UMLS, which specifies semantic relations between medical concepts. We developed two methods to improve the accuracy and yield of relations in CLIR: a method for relation filtering a...

Cross-Lingual Medical Information Retrieval through Semantic Annotation

We present a framework for concept-based, cross-lingual information retrieval (CLIR) in the medical domain, which is under development in the MUCHMORE project. Our approach is based on using the Unified Medical Language System (UMLS) as the primary source of semantic data, whereby documents and queries are annotated with multiple layers of linguistic information. Linguistic processing includes ...

Ontologies in Cross-Language Information Retrieval

We present an approach to using ontologies as interlingua in cross-language information retrieval in the medical domain. Our approach is based on using the Unified Medical Language System (UMLS) as the primary ontology. Documents and queries are annotated with multiple layers of linguistic information (part-of-speech tags, lemmas, phrase chunks). Based on this we identify medical terms and sema...

Evaluation Resources for Concept-based Cross-Lingual Information Retrieval in the Medical Domain

The paper describes evaluation resources for concept-based, cross-lingual information retrieval in the medical domain. All resources were constructed in the context of the MuchMore project and are freely available through the project website. Available resources include: a bilingual, parallel document collection of German and English medical scientific abstracts, a set of queries and correspond...

Journal:
  • International journal of medical informatics

Volume 67, Issue 1-3

Pages: -

Publication date: 2002